---
layout: page
title: Exercise 2
permalink: /scripts/exercise2/
parent: R Scripts
nav_order: 3
---

Exercise 2


Exercise: Why Don’t the Poor Save More? Evidence from Health Savings Experiments

Researchers designed a field experiment in rural Kenya to investigate why the poor are constrained in their ability to save money. In this experiment, researchers randomly varied individuals’ access to technology that would enable greater security of investment. By observing the impact of these technologies on the amount of money saved, researchers were able to identify key barriers to saving.

This exercise is based on: Dupas, Pascaline and Jonathan Robinson. 2013. “Why Don’t the Poor Save More? Evidence from Health Savings Experiments.” American Economic Review, 103(4): 1138-1171, http://dx.doi.org/10.1257/aer.103.4.1138.

Dupas and Robinson worked with 113 ROSCAs (Rotating Savings and Credit Associations). A ROSCA is a group of individuals who come together and make regular cyclical contributions to a fund (called the “pot”), which is then given as a lump sum to one member in each cycle. In their experiment, they randomly assigned the 113 ROSCAs to one of five study arms. In this exercise, we will focus on three study arms (one control and two treatment arms). The data file, rosca.csv, is extracted from their original data; for simplicity, it excludes individuals who received multiple treatments.

Individuals in all study arms were encouraged to save for investments in preventative health products and were asked to set a health goal for themselves at the beginning of the study. They were also assigned randomly to a treatment condition:

• In the first treatment group (Safe Box), respondents were given a box locked with a padlock, and the key to the padlock was provided to the participants. They were asked to record what health product they were saving for and its cost. This treatment is designed to estimate the effect of having a safe and designated storage technology for preventative health savings.

• In the second treatment group (Locked Box), respondents were given a locked box, but not the key to the padlock. The respondents were instructed to call the program officer once they had reached their saving goal, and the program officer would then meet the participant and open the Locked Box at the shop where the product is purchased.

Compared to the safe box, the locked box offered a stronger commitment through earmarking: the money saved could only be used for the pre-specified purpose.

Participants were interviewed again 6 months and 12 months later. In this exercise, our outcome of interest is the amount (in Kenyan shillings) spent on preventative health products after 12 months: fol2_amtinvest.

| Name | Description |
|------|-------------|
| bg_female | 1 if female, and 0 otherwise |
| bg_married | 1 if married, and 0 otherwise |
| bg_b1_age | Age at baseline |
| encouragement | 1 if encouragement only (control group), and 0 otherwise |
| safe_box | 1 if safe box treatment, and 0 otherwise |
| locked_box | 1 if locked box treatment, and 0 otherwise |
| fol2_amtinvest | Amount invested in health products |
| has_followup2 | 1 if appears in 2nd followup (after 12 months), and 0 otherwise |

Task 1

As with any new dataset, it is important to first get acquainted with its structure and its key variables.

• Load the data set.

• How many participants are there in total?

• What is the percentage of male and female participants?

• How many participants are married?

• What is the mean age of participants?

Hint: Use read.csv() to load the file, and describe the dataset using nrow() or dim(). For the remaining questions, try out different approaches: use summary() on the dataset, use mean() on individual variables, and consider whether table() can help.
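As a rough illustration of the functions named in the hint, the sketch below runs them on a small made-up data frame rather than on rosca.csv. The toy values, the temporary file, and the object names (toy, path, n_total, pct_gender, n_married, mean_age) are assumptions for demonstration only. The use of na.rm = TRUE assumes that some variables may contain missing values, which you should check in the real data.

```r
# Toy stand-in for rosca.csv (made-up values, NOT the study data).
toy <- data.frame(
  bg_female  = c(1, 1, 0, 1, 0, 1),
  bg_married = c(1, 0, 1, 1, NA, 1),
  bg_b1_age  = c(34, 28, 45, NA, 39, 52)
)

# Write the toy data to a temporary CSV so that read.csv() can be shown.
path <- tempfile(fileext = ".csv")
write.csv(toy, path, row.names = FALSE)

# With the real file this would be: rosca <- read.csv("rosca.csv")
rosca <- read.csv(path)

# Number of participants (rows); dim() gives rows and columns together.
n_total <- nrow(rosca)
dim(rosca)

# Percentage of men (0) and women (1).
pct_gender <- prop.table(table(rosca$bg_female)) * 100

# Number of married participants; na.rm = TRUE skips missing values.
n_married <- sum(rosca$bg_married == 1, na.rm = TRUE)

# Mean age at baseline, again ignoring missing values.
mean_age <- mean(rosca$bg_b1_age, na.rm = TRUE)

# A quick overview of every column, including counts of NAs.
summary(rosca)
```

Without na.rm = TRUE, mean() and sum() return NA as soon as a single value is missing, so the summary() output is a useful first check.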

Task 2

• Create a single factor variable treatment that takes the value control if participants received only encouragement, safebox if they received a safe box, and lockbox if they received a locked box.

• How many individuals are in the control group? How many individuals are in each of the treatment arms?

• Finally, use the table() command on the new variable.

Hint: Create a new variable in the dataset called treatment that takes the values control, safebox, and lockbox, depending on whether encouragement, safe_box, or locked_box, respectively, equals 1 in the original data. There are two ways of doing it, as we did above:

• Creating an empty treatment variable (filled with NA) and assigning the values control, safebox, and lockbox through indexing;

• Using a nested ifelse() command.

• Try both approaches and compare your results.
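The two approaches can be sketched as follows on a small made-up data frame. The toy values and the second variable name treatment2 are assumptions for illustration; with the real data, the same steps would be applied to the data frame loaded from rosca.csv.

```r
# Toy indicator columns (made-up values, NOT the study data).
toy <- data.frame(
  encouragement = c(1, 0, 0, 1, 0, 0),
  safe_box      = c(0, 1, 0, 0, 1, 0),
  locked_box    = c(0, 0, 1, 0, 0, 1)
)

# Approach 1: start with NA and fill in the values by logical indexing.
toy$treatment <- NA
toy$treatment[toy$encouragement == 1] <- "control"
toy$treatment[toy$safe_box == 1]      <- "safebox"
toy$treatment[toy$locked_box == 1]    <- "lockbox"
toy$treatment <- factor(toy$treatment,
                        levels = c("control", "safebox", "lockbox"))

# Approach 2: a nested ifelse(); rows matching no indicator become NA.
toy$treatment2 <- factor(
  ifelse(toy$encouragement == 1, "control",
         ifelse(toy$safe_box == 1, "safebox",
                ifelse(toy$locked_box == 1, "lockbox", NA))),
  levels = c("control", "safebox", "lockbox")
)

# Group sizes, and a check that both approaches agree.
table(toy$treatment)
identical(toy$treatment, toy$treatment2)
```

Setting levels explicitly fixes the order in which table() reports the groups; otherwise factor() sorts them alphabetically.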

Task 3

• Subset the data so that it contains only participants who were interviewed 12 months later during the second followup. Use this subset for the subsequent analyses.

• How many participants are left in each treatment group of this subset?

• Calculate drop-out rates for each group. Does the drop-out rate differ across the treatment conditions?

• What does this result suggest about the internal and external validity of this study?

Hint: Use the subset() function to create a new dataset and use the table() command on your treatment variable. table() can also be used to calculate the drop-out rates.
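One possible way to combine subset() and table() for the drop-out calculation is sketched below on made-up data. The toy values and the object names (followup, n_before, n_after, dropout_rate, by_group) are assumptions for illustration, and the sketch presumes that a treatment factor like the one from Task 2 already exists.

```r
# Toy data (made-up values, NOT the study data).
toy <- data.frame(
  treatment = factor(
    c("control", "control", "control", "control",
      "safebox", "safebox", "safebox", "safebox",
      "lockbox", "lockbox", "lockbox", "lockbox"),
    levels = c("control", "safebox", "lockbox")
  ),
  has_followup2 = c(1, 1, 1, 0,  1, 1, 0, 0,  1, 1, 1, 1)
)

# Keep only participants who appear in the second followup.
followup <- subset(toy, has_followup2 == 1)

# Group sizes before and after subsetting.
n_before <- table(toy$treatment)
n_after  <- table(followup$treatment)

# Drop-out rate per group: share of each group lost by the second followup.
dropout_rate <- 1 - n_after / n_before

# Alternative: a two-way table with row proportions;
# column "0" is the drop-out share, column "1" the retention share.
by_group <- prop.table(table(toy$treatment, toy$has_followup2), margin = 1)
```

Both routes give the same rates. The two-way table has the advantage of not requiring the subset to be created first.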